AdaptIE - Using Domain Language Concept to Enable Domain Experts in Modeling of Information Extraction Plans

Authors

  • Wojciech M. Barczynski
  • Felix Förster
  • Falk Brauer
  • Daniel Schuster
Abstract

Implementing domain-specific Information Extraction (IE) technologies to retrieve structured information from unstructured data is a challenging and complex task. It requires both IE expertise and domain knowledge, the latter provided by a domain expert who is aware of, e.g., the specifics of the text corpus and the entities of interest. While the IE expert role is addressed by several approaches, less has been done to involve domain experts in the IE development process. Our approach targets this issue. We provide a base platform on which experts collaborate through IE modeling languages, giving each expert an IE language adapted to their respective expertise: IE experts work with a fine-grained view of the IE plan, while domain experts use a coarse-grained view. We use the Model Driven Architecture concept to enable transitions among the languages and a model of an algebraic IE framework. To show the applicability of our approach, we implemented AdaptIE and demonstrate it in a real-world scenario for the SAP Community Network.
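The two-view idea in the abstract can be illustrated with a minimal sketch. It is not AdaptIE's actual language or API, which the abstract does not specify. All names here (`DomainConcept`, `to_plan`, `run_plan`, the two operator factories) and the sample text are hypothetical. The sketch assumes a coarse-grained domain model that lists concepts with example terms or a pattern, and a simple transformation that turns it into a fine-grained plan of composable extraction operators.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Fine-grained view (hypothetical): an operator takes the text and the
# annotations found so far, and returns an extended annotation list.
Annotation = Dict[str, str]
Operator = Callable[[str, List[Annotation]], List[Annotation]]


def dictionary_extract(label: str, terms: List[str]) -> Operator:
    """Operator that annotates each listed term found in the text."""
    def op(text: str, annotations: List[Annotation]) -> List[Annotation]:
        lowered = text.lower()
        found = [{"type": label, "value": t} for t in terms if t.lower() in lowered]
        return annotations + found
    return op


def regex_extract(label: str, pattern: str) -> Operator:
    """Operator that annotates every match of a regular expression."""
    compiled = re.compile(pattern)

    def op(text: str, annotations: List[Annotation]) -> List[Annotation]:
        found = [{"type": label, "value": m.group(0)} for m in compiled.finditer(text)]
        return annotations + found
    return op


# Coarse-grained view (hypothetical): what a domain expert would write.
@dataclass
class DomainConcept:
    name: str
    examples: List[str] = field(default_factory=list)
    pattern: str = ""


def to_plan(concepts: List[DomainConcept]) -> List[Operator]:
    """Model transformation: coarse-grained concepts -> fine-grained operators."""
    plan: List[Operator] = []
    for concept in concepts:
        if concept.examples:
            plan.append(dictionary_extract(concept.name, concept.examples))
        if concept.pattern:
            plan.append(regex_extract(concept.name, concept.pattern))
    return plan


def run_plan(plan: List[Operator], text: str) -> List[Annotation]:
    """Apply the operators in order, threading the annotations through."""
    annotations: List[Annotation] = []
    for op in plan:
        annotations = op(text, annotations)
    return annotations


if __name__ == "__main__":
    # Made-up domain model and forum sentence, for illustration only.
    model = [
        DomainConcept("Product", examples=["ProductX"]),
        DomainConcept("ErrorCode", pattern=r"\b[A-Z]{2}\d{3}\b"),
    ]
    print(run_plan(to_plan(model), "ProductX fails with error AB123 on startup."))
```

In this sketch the domain expert only edits the list of `DomainConcept` entries, while an IE expert could inspect or tune the generated operator list directly. That split is the point of keeping the two views connected by a transformation.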


Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...


RankIE: Document Retrieval on Ranked Entity Graphs

Developer communities built around software products, like the SAP Community Network, provide a knowledge base for recurring problems and their solutions. Due to the large amount of content maintained in such communities, e.g., in forums, finding relevant solutions is a major challenge beyond the scope of common keyword-based search engines. In fact, it is measured that around 50% of the forum...


Topic Modeling and Classification of Cyberspace Papers Using Text Mining

The global cyberspace networks provide individuals with platforms where they can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...


The Investigation of the Perspectives of Iranian EFL Domain Experts on Postmethod Pedagogy: A Delphi Technique

After the introduction of postmethod pedagogy by Kumaravadivelu with its three principles of particularity, possibility and practicality, a wave of attention was directed towards this so-called 'postmethod era' and its appropriacy and adequacy in satiating the demands of the language learners in this 'brand new world'. This situation has created a healthy debate among the Iranian EFL community ...


A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...




Publication date: 2010